import pandas as pd
import numpy as np
import plotly.express as px
data = pd.read_csv("deliverytime.txt")
print(data.head())
ID Delivery_person_ID Delivery_person_Age Delivery_person_Ratings \
0 4607 INDORES13DEL02 37 4.9
1 B379 BANGRES18DEL02 34 4.5
2 5D6D BANGRES19DEL01 23 4.4
3 7A6A COIMBRES13DEL02 38 4.7
4 70A2 CHENRES12DEL01 32 4.6
Restaurant_latitude Restaurant_longitude Delivery_location_latitude \
0 22.745049 75.892471 22.765049
1 12.913041 77.683237 13.043041
2 12.914264 77.678400 12.924264
3 11.003669 76.976494 11.053669
4 12.972793 80.249982 13.012793
Delivery_location_longitude Type_of_order Type_of_vehicle Time_taken(min)
0 75.912471 Snack motorcycle 24
1 77.813237 Snack scooter 33
2 77.688400 Drinks motorcycle 26
3 77.026494 Buffet motorcycle 21
4 80.289982 Snack scooter 30
Let’s have a look at the column insights before moving forward:
data.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 45593 entries, 0 to 45592
Data columns (total 11 columns):
 #   Column                       Non-Null Count  Dtype
---  ------                       --------------  -----
 0   ID                           45593 non-null  object
 1   Delivery_person_ID           45593 non-null  object
 2   Delivery_person_Age          45593 non-null  int64
 3   Delivery_person_Ratings      45593 non-null  float64
 4   Restaurant_latitude          45593 non-null  float64
 5   Restaurant_longitude         45593 non-null  float64
 6   Delivery_location_latitude   45593 non-null  float64
 7   Delivery_location_longitude  45593 non-null  float64
 8   Type_of_order                45593 non-null  object
 9   Type_of_vehicle              45593 non-null  object
 10  Time_taken(min)              45593 non-null  int64
dtypes: float64(5), int64(2), object(4)
memory usage: 3.8+ MB
Now let’s have a look at whether this dataset contains any null values or not:
data.isnull().sum()
ID                             0
Delivery_person_ID             0
Delivery_person_Age            0
Delivery_person_Ratings        0
Restaurant_latitude            0
Restaurant_longitude           0
Delivery_location_latitude     0
Delivery_location_longitude    0
Type_of_order                  0
Type_of_vehicle                0
Time_taken(min)                0
dtype: int64
The dataset doesn’t have any feature that shows the distance between the restaurant and the delivery location. All we have are the latitude and longitude points of the restaurant and the delivery location. We can use the haversine formula to calculate the distance between two locations based on their latitudes and longitudes.
Below is how we can find the distance between the restaurant and the delivery location based on their latitudes and longitudes by using the haversine formula:
# Set the earth's radius (in kilometers)
R = 6371

# Convert degrees to radians
def deg_to_rad(degrees):
    return degrees * (np.pi/180)

# Function to calculate the distance between two points using the haversine formula
def distcalculate(lat1, lon1, lat2, lon2):
    d_lat = deg_to_rad(lat2-lat1)
    d_lon = deg_to_rad(lon2-lon1)
    a = np.sin(d_lat/2)**2 + np.cos(deg_to_rad(lat1)) * np.cos(deg_to_rad(lat2)) * np.sin(d_lon/2)**2
    c = 2 * np.arctan2(np.sqrt(a), np.sqrt(1-a))
    return R * c

# Calculate the distance between each pair of points
data['distance'] = np.nan
for i in range(len(data)):
    data.loc[i, 'distance'] = distcalculate(data.loc[i, 'Restaurant_latitude'],
                                            data.loc[i, 'Restaurant_longitude'],
                                            data.loc[i, 'Delivery_location_latitude'],
                                            data.loc[i, 'Delivery_location_longitude'])
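The row-by-row loop is easy to follow, but it can be slow on 45,593 rows. As a minimal sketch, the same haversine formula can be applied to whole columns at once with NumPy. The function name haversine_km and the small sample frame are illustrative assumptions, not part of the original code; the coordinates are the first three rows printed by data.head().

```python
import numpy as np
import pandas as pd

EARTH_RADIUS_KM = 6371  # same radius as R in the code above

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km; accepts scalars or whole columns."""
    # Convert every input from degrees to radians in one step
    lat1, lon1, lat2, lon2 = map(np.radians, (lat1, lon1, lat2, lon2))
    d_lat = lat2 - lat1
    d_lon = lon2 - lon1
    a = np.sin(d_lat / 2) ** 2 + np.cos(lat1) * np.cos(lat2) * np.sin(d_lon / 2) ** 2
    return EARTH_RADIUS_KM * 2 * np.arctan2(np.sqrt(a), np.sqrt(1 - a))

# Hypothetical sample frame holding the first three rows of the dataset
sample = pd.DataFrame({
    "Restaurant_latitude": [22.745049, 12.913041, 12.914264],
    "Restaurant_longitude": [75.892471, 77.683237, 77.678400],
    "Delivery_location_latitude": [22.765049, 13.043041, 12.924264],
    "Delivery_location_longitude": [75.912471, 77.813237, 77.688400],
})

# One vectorized call replaces the explicit for loop
sample["distance"] = haversine_km(
    sample["Restaurant_latitude"],
    sample["Restaurant_longitude"],
    sample["Delivery_location_latitude"],
    sample["Delivery_location_longitude"],
)
print(sample["distance"])
```

On the full dataset, the same call would take the columns of data instead of sample.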
We have now calculated the distance between the restaurant and the delivery location and added it to the dataset as a new feature named distance. Let’s look at the dataset again:
print(data.head())
ID Delivery_person_ID Delivery_person_Age Delivery_person_Ratings \
0 4607 INDORES13DEL02 37 4.9
1 B379 BANGRES18DEL02 34 4.5
2 5D6D BANGRES19DEL01 23 4.4
3 7A6A COIMBRES13DEL02 38 4.7
4 70A2 CHENRES12DEL01 32 4.6
Restaurant_latitude Restaurant_longitude Delivery_location_latitude \
0 22.745049 75.892471 22.765049
1 12.913041 77.683237 13.043041
2 12.914264 77.678400 12.924264
3 11.003669 76.976494 11.053669
4 12.972793 80.249982 13.012793
Delivery_location_longitude Type_of_order Type_of_vehicle Time_taken(min) \
0 75.912471 Snack motorcycle 24
1 77.813237 Snack scooter 33
2 77.688400 Drinks motorcycle 26
3 77.026494 Buffet motorcycle 21
4 80.289982 Snack scooter 30
distance
0 3.025149
1 20.183530
2 1.552758
3 7.790401
4 6.210138
Now let’s explore the data to find relationships between the features. I’ll start by looking at the relationship between the distance and time taken to deliver the food:
figure = px.scatter(data_frame = data,
x="distance",
y="Time_taken(min)",
size="Time_taken(min)",
trendline="ols",
title = "Relationship Between Distance and Time Taken")
figure.show()
There is a consistent relationship between the time taken and the distance travelled to deliver the food. It means that most delivery partners deliver food within 25-30 minutes, regardless of distance.
Now let’s have a look at the relationship between the time taken to deliver the food and the age of the delivery partner:
figure = px.scatter(data_frame = data,
x="Delivery_person_Age",
y="Time_taken(min)",
size="Time_taken(min)",
color = "distance",
trendline="ols",
title = "Relationship Between Time Taken and Age")
figure.show()
There is a linear relationship between the time taken to deliver the food and the age of the delivery partner. It means younger delivery partners take less time to deliver the food than older partners.
Now let’s have a look at the relationship between the time taken to deliver the food and the ratings of the delivery partner:
figure = px.scatter(data_frame = data,
x="Delivery_person_Ratings",
y="Time_taken(min)",
size="Time_taken(min)",
color = "distance",
trendline="ols",
title = "Relationship Between Time Taken and Ratings")
figure.show()
There is an inverse linear relationship between the time taken to deliver the food and the ratings of the delivery partner. It means delivery partners with higher ratings take less time to deliver the food compared to partners with low ratings.
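The trends read off the scatter plots can also be checked numerically. The sketch below computes the Pearson correlation of each feature with the delivery time using pandas. The sample frame is an assumption built from the five rows printed by data.head(), so its numbers only illustrate the call and are far too few to confirm the trends; on the real data you would run the same lines on data.

```python
import pandas as pd

# Hypothetical sample: the five rows shown by data.head()
sample = pd.DataFrame({
    "Delivery_person_Age": [37, 34, 23, 38, 32],
    "Delivery_person_Ratings": [4.9, 4.5, 4.4, 4.7, 4.6],
    "distance": [3.025149, 20.183530, 1.552758, 7.790401, 6.210138],
    "Time_taken(min)": [24, 33, 26, 21, 30],
})

# Pearson correlation of every feature with the target column
corr_with_time = sample.corr()["Time_taken(min)"].drop("Time_taken(min)")
print(corr_with_time)
```

A positive value means the delivery time tends to rise with the feature, and a negative value means it tends to fall.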
Now let’s check whether the type of food ordered by the customer and the type of vehicle used by the delivery partner affect the delivery time:
fig = px.box(data,
x="Type_of_vehicle",
y="Time_taken(min)",
color="Type_of_order")
fig.show()
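A box plot is read by eye, so a small table of medians per group can make the comparison easier. This is a sketch under the assumption that the median is a reasonable summary here. The sample frame again reuses the five rows from data.head() and only demonstrates the groupby call.

```python
import pandas as pd

# Hypothetical sample: the five rows shown by data.head()
sample = pd.DataFrame({
    "Type_of_vehicle": ["motorcycle", "scooter", "motorcycle", "motorcycle", "scooter"],
    "Type_of_order": ["Snack", "Snack", "Drinks", "Buffet", "Snack"],
    "Time_taken(min)": [24, 33, 26, 21, 30],
})

# Median delivery time for each (vehicle, order type) combination
median_time = sample.groupby(["Type_of_vehicle", "Type_of_order"])["Time_taken(min)"].median()
print(median_time)
```

If the medians on the full dataset sit close together across groups, that supports reading the box plot as showing little effect of vehicle or order type.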
# pip install tensorflow
So there is not much difference in delivery time across the type of vehicle the delivery partner drives or the type of food being delivered.
So the features that contribute most to the food delivery time based on our analysis are:
age of the delivery partner
ratings of the delivery partner
distance between the restaurant and the delivery location
In the section below, we will learn how to train a Machine Learning model for food delivery time prediction.
Now let’s train a Machine Learning model using an LSTM neural network model for the task of food delivery time prediction:
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers
#splitting data
from sklearn.model_selection import train_test_split
x = np.array(data[["Delivery_person_Age",
"Delivery_person_Ratings",
"distance"]])
y = np.array(data[["Time_taken(min)"]])
xtrain, xtest, ytrain, ytest = train_test_split(x, y,
test_size=0.10,
random_state=42)
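As a quick sanity check on the split, the arithmetic below estimates how many rows end up in each set. It assumes that train_test_split rounds a fractional test_size up to a whole number of test rows, which is my understanding of scikit-learn's behaviour; the variable names are illustrative.

```python
import math

n_rows = 45593      # row count reported by data.info()
test_size = 0.10    # same fraction passed to train_test_split

# Assumption: the test set size is rounded up, the rest goes to training
n_test = math.ceil(n_rows * test_size)
n_train = n_rows - n_test

print("training rows:", n_train)
print("test rows:", n_test)
```

The training count agrees with the 41033 steps per epoch in the training log, which is what you would expect when each batch holds a single row.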
# creating the LSTM neural network model
from keras.models import Sequential
from keras.layers import Dense, LSTM
model = Sequential()
model.add(LSTM(128, return_sequences=True, input_shape= (xtrain.shape[1], 1)))
model.add(LSTM(64, return_sequences=False))
model.add(Dense(25))
model.add(Dense(1))
model.summary()
Model: "sequential"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
lstm (LSTM) (None, 3, 128) 66560
lstm_1 (LSTM) (None, 64) 49408
dense (Dense) (None, 25) 1625
dense_1 (Dense) (None, 1) 26
=================================================================
Total params: 117,619
Trainable params: 117,619
Non-trainable params: 0
_________________________________________________________________
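The parameter counts in the summary can be reproduced by hand. An LSTM layer has four gates, each with input weights, recurrent weights, and a bias, while a Dense layer has one weight per input-output pair plus a bias per output. The helper names below are assumptions made for this sketch.

```python
def lstm_params(units, input_dim):
    """4 gates x (input weights + recurrent weights + bias)."""
    return 4 * (units * (input_dim + units) + units)

def dense_params(units, input_dim):
    """One weight per input-output pair plus one bias per output."""
    return units * input_dim + units

counts = {
    "lstm": lstm_params(128, 1),      # each timestep carries 1 feature
    "lstm_1": lstm_params(64, 128),   # receives the 128-dim sequence output
    "dense": dense_params(25, 64),
    "dense_1": dense_params(1, 25),
}
total = sum(counts.values())
print(counts)
print("total:", total)
```

The four values and their sum match the Param # column and the total printed by model.summary().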
# training the model
model.compile(optimizer='adam', loss='mean_squared_error')
model.fit(xtrain, ytrain, batch_size=1, epochs=9)
Epoch 1/9
41033/41033 [==============================] - 136s 3ms/step - loss: 69.1238
Epoch 2/9
41033/41033 [==============================] - 137s 3ms/step - loss: 63.8280
Epoch 3/9
41033/41033 [==============================] - 145s 4ms/step - loss: 61.2141
Epoch 4/9
41033/41033 [==============================] - 127s 3ms/step - loss: 61.1031
Epoch 5/9
41033/41033 [==============================] - 129s 3ms/step - loss: 60.0784
Epoch 6/9
41033/41033 [==============================] - 129s 3ms/step - loss: 59.4225
Epoch 7/9
41033/41033 [==============================] - 149s 4ms/step - loss: 59.3786
Epoch 8/9
41033/41033 [==============================] - 140s 3ms/step - loss: 59.1872
Epoch 9/9
41033/41033 [==============================] - 142s 3ms/step - loss: 58.6093
<keras.callbacks.History at 0x23721ef70a0>
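The loss is a mean squared error, so its square root gives a rough error in minutes. The sketch below converts the final training loss and defines a small helper for MAE and RMSE. The helper name regression_errors and the three example values are hypothetical; with the trained model you would pass ytest and model.predict(xtest) instead, since the held-out set is otherwise never used.

```python
import numpy as np

# Final training loss (MSE) reported for epoch 9
final_mse = 58.6093
rmse_minutes = float(np.sqrt(final_mse))
print("approximate training RMSE in minutes:", round(rmse_minutes, 2))

def regression_errors(y_true, y_pred):
    """Return mean absolute error and root mean squared error."""
    y_true = np.asarray(y_true, dtype=float).ravel()
    y_pred = np.asarray(y_pred, dtype=float).ravel()
    err = y_pred - y_true
    return {
        "mae": float(np.mean(np.abs(err))),
        "rmse": float(np.sqrt(np.mean(err ** 2))),
    }

# Hypothetical values, only to show how the helper is called
errors = regression_errors([24, 33, 26], [26, 30, 26])
print(errors)
```

An RMSE of several minutes on deliveries that mostly take 25-30 minutes suggests the predictions are still fairly coarse.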
Now let’s test the performance of our model by giving inputs to predict the food delivery time:
print("Food Delivery Time Prediction")
a = int(input("Age of Delivery Partner: "))
b = float(input("Ratings of Previous Deliveries: "))
c = float(input("Total Distance: "))  # distance in km can be fractional
features = np.array([[a, b, c]])
print("Predicted Delivery Time in Minutes = ", model.predict(features))
Food Delivery Time Prediction
Age of Delivery Partner: 34
Ratings of Previous Deliveries: 4
Total Distance: 8
1/1 [==============================] - 1s 543ms/step
Predicted Delivery Time in Minutes = [[33.461246]]
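The model was declared with input_shape=(xtrain.shape[1], 1), which means each sample is treated as 3 steps with 1 value per step. The prediction call above passes a 2D array and appears to rely on Keras adding the trailing axis. A minimal sketch of making that shape explicit with NumPy, using the same example inputs:

```python
import numpy as np

# Same example inputs as in the session above: age, rating, distance
features = np.array([[34, 4.0, 8.0]])   # shape (1, 3): one sample, three values

# Add a trailing axis so the array is (samples, timesteps, values per step)
features_3d = features.reshape(features.shape[0], features.shape[1], 1)

print(features.shape)
print(features_3d.shape)
```

Passing features_3d to model.predict keeps the input shape consistent with the model definition regardless of how the framework handles 2D inputs.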
So this is how you can use Machine Learning for the task of food delivery time prediction using the Python programming language.